Added support for OpenAI Text to Audio (Speech API) #317
@tzolov it seems you worked on the speech API code and merged it to main. Should I close this PR?
Hi @hemeda3, thanks for reaching out. I realised that until we have at least two text-to-speech and speech-to-text client implementations from different AI vendors, it is premature to create common model abstractions under spring-ai-core/model. The latter are meant to facilitate portability between vendors, but with a single implementation there is not enough data to decide what the common abstractions should look like. Having said this, would you be interested in reworking your PR after the refactoring I did? You will have to base your client on the OpenAiAudioApi low-level client and move the code from spring-ai-core to the spring-ai-openai .../audio/speech package (e.g. next to .../audio/transcription).
@tzolov, thanks for the explanation 🙏. Actually, I was a bit confused, since both APIs (speech + transcription) share the same OpenAI audio API at a low level, but your changes have clarified things for me. I'm happy to rework my PR based on your updates. If I have any questions, I'll reach out. Thanks for the opportunity to contribute and learn.
Hi @tzolov, should I add the documentation to the same PR or to a new PR?
Hi @hemeda3, thanks for asking.
Added speech API adoc:
private OpenAiAudioSpeechOptions speechOptions;

private final List<SpeechMessage> messages;
The underlying API only accepts a single string input, not a collection. So I think this should be ModelRequest<SpeechMessage> and not ModelRequest<List<SpeechMessage>>.
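To illustrate the reviewer's point, here is a minimal sketch of a request typed on a single message rather than a list of messages. It is not the code from this PR: the `ModelRequest` interface below is a simplified, hypothetical stand-in for the one in spring-ai-core (which may declare further methods), and `SpeechPrompt` and the shape of `SpeechMessage` are assumed names chosen for illustration.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for the generic request contract
// referenced in the review; the real spring-ai-core interface may differ.
interface ModelRequest<T> {
    T getInstructions();
}

// Assumed shape: one message wraps the single text input the endpoint accepts.
record SpeechMessage(String text) {
    SpeechMessage {
        Objects.requireNonNull(text, "text must not be null");
    }
}

// Request typed on one SpeechMessage, so callers cannot pass a collection
// that the underlying API would have no way to honour.
final class SpeechPrompt implements ModelRequest<SpeechMessage> {

    private final SpeechMessage message;

    SpeechPrompt(String text) {
        this.message = new SpeechMessage(text);
    }

    @Override
    public SpeechMessage getInstructions() {
        return message;
    }
}
```

With this typing, `new SpeechPrompt("Hello").getInstructions().text()` yields the one string that would be sent as the API input, and there is no list whose extra elements would need to be dropped or concatenated.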
It took a while to get to, but it is now merged. Thanks, this was a great contribution! Merged as 766b420
https://platform.openai.com/docs/api-reference/audio/createSpeech#:~:text=Speech%20to%20text-,Create%20speech,-
How to use it:
Config:
Manual options with metadata/rate-limit info and prompt style
Streaming speech audio directly from the OpenAI API
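For orientation, the Create speech endpoint linked above can also be called over plain HTTP. The following is a minimal sketch using the JDK's `java.net.http` API, not the Spring AI client added by this PR. The class name `SpeechRequestSketch` and its helper methods are hypothetical; the `tts-1` model and `alloy` voice are example values from the OpenAI API, and the hand-rolled JSON escaping covers only quotes and backslashes.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds (but does not send) a Create speech request.
final class SpeechRequestSketch {

    // Escapes backslashes and double quotes only; control characters are not
    // handled, so real code would use a JSON library instead.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // The endpoint takes a single "input" string, plus a model and a voice.
    static String body(String text) {
        return "{\"model\":\"tts-1\",\"voice\":\"alloy\",\"input\":\""
                + escapeJson(text) + "\"}";
    }

    static HttpRequest build(String apiKey, String text) {
        return HttpRequest.newBuilder(URI.create("https://api.openai.com/v1/audio/speech"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(text)))
                .build();
    }
}
```

Sending the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofInputStream())` returns the audio as a stream of bytes that can be written to a file or forwarded as it arrives.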